ICD10 Coding of Death Certificates with the NCBO and SIFR Annotator(s) at CLEF eHealth 2017 Task 1

Authors

  • Andon Tchechmedjiev
  • Amine Abdaoui
  • Vincent Emonet
  • Clement Jonquet
Abstract

The SIFR BioPortal is an open platform to host French biomedical ontologies and terminologies, based on the technology developed by the US National Center for Biomedical Ontology (NCBO). The portal facilitates the use and fostering of terminologies and ontologies by offering a set of services including semantic annotation. The SIFR Annotator (http://bioportal.lirmm.fr/annotator) is a publicly accessible, easily usable ontology-based annotation tool to process French text data and facilitate semantic indexing. The web service relies on the ontology content (preferred labels and synonyms) as well as on the semantics of the ontologies (is-a hierarchies) and their mappings. The SIFR BioPortal also offers the possibility of querying the original NCBO Annotator for English text via a dedicated proxy that extends the original functionality. In this paper, we present a preliminary performance evaluation of the generic annotation web service (i.e., not specifically customized) for coding death certificates, i.e., annotating them with ICD-10 codes. This evaluation is performed against the manually annotated CépiDC/CDC corpus of CLEF eHealth 2017 Task 1. For this purpose, we built custom SKOS vocabularies from the CépiDC/CDC dictionaries as well as from the training and development corpora, for all three tasks, using a most-frequent-code heuristic to assign ambiguous labels. We then submitted the vocabularies to the NCBO and SIFR BioPortal and ran the annotation services on the Task 1 datasets. For our best runs on each corpus, we obtained the following results: English raw corpus (69.08% P, 51.37% R, 58.92% F1); French raw corpus (54.11% P, 48.00% R, 50.87% F1); French aligned corpus (50.63% P, 52.97% R, 51.77% F1).
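The most-frequent-code heuristic mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `(label, code)` input shape, the lowercase normalization, the tie-breaking rule, and the toy entries are all hypothetical. The small `f1` helper simply recomputes the harmonic mean of the reported precision and recall.

```python
from collections import Counter, defaultdict


def most_frequent_code(pairs):
    """Map each label to the ICD-10 code it is most often paired with.

    `pairs` is an iterable of (label, code) tuples, for example drawn from
    a dictionary and from training annotations (hypothetical input shape).
    """
    counts = defaultdict(Counter)
    for label, code in pairs:
        # Assumption: labels are compared case-insensitively.
        counts[label.strip().lower()][code] += 1
    # Assumption: ties go to the alphabetically first code, so the
    # result is deterministic.
    return {
        label: min(c.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for label, c in counts.items()
    }


def f1(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Toy entries invented for illustration; not taken from the task data.
pairs = [
    ("insuffisance cardiaque", "I509"),
    ("insuffisance cardiaque", "I509"),
    ("insuffisance cardiaque", "I500"),
    ("Choc septique", "A419"),
]
label_to_code = most_frequent_code(pairs)
print(label_to_code)

# Recompute the F1 scores from the precision/recall pairs in the abstract.
for p, r in [(69.08, 51.37), (54.11, 48.00), (50.63, 52.97)]:
    print(f"P={p:.2f} R={r:.2f} F1={f1(p, r):.2f}")
```

Each ambiguous label ends up with a single code, which is what a vocabulary with one concept per label would need before being submitted to an annotation service.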

Similar resources

CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French

This paper reports on Task 1 of the 2017 CLEF eHealth evaluation lab which extended the previous information extraction tasks of ShARe/CLEF eHealth evaluation labs. The task continued with coding of death certificates, as introduced in CLEF eHealth 2016. This large-scale classification task consisted of extracting causes of death as coded in the International Classification of Diseases, tenth re...

SIBM at CLEF eHealth Evaluation Lab 2017: Multilingual Information Extraction with CIM-IND

This paper presents SIBM’s participation in the Task 1: Multilingual Information Extraction ICD10 coding of the CLEF eHealth 2017 evaluation initiative which focuses on named entity recognition in French and English death certificates. We addressed the identification of relevant clinical entities within the International Classification of Diseases version 10 (ICD10) in the CépiDC and CDC datase...

A Reproducible Approach with R Markdown to Automatic Classification of Medical Certificates in French

In this paper, we report the ongoing developments of our first participation in the Cross-Language Evaluation Forum (CLEF) eHealth Task 1: “Multilingual Information Extraction ICD10 coding” (Névéol et al., 2017). The task consists in labelling death certificates in French with international standard codes. In particular, we wanted to accomplish the goal of the ‘Replication track’ of t...

KFU at CLEF eHealth 2017 Task 1: ICD-10 Coding of English Death Certificates with Recurrent Neural Networks

This paper describes the participation of the KFU team in the CLEF eHealth 2017 challenge. Specifically, we participated in Task 1, namely “Multilingual Information Extraction ICD-10 coding” for which we implemented recurrent neural networks to automatically assign ICD10 codes to fragments of death certificates written in English. Our system uses Long Short-Term Memory (LSTM) to map the input s...

Clinical Information Extraction at the CLEF eHealth Evaluation lab 2016

This paper reports on Task 2 of the 2016 CLEF eHealth evaluation lab which extended the previous information extraction tasks of ShARe/CLEF eHealth evaluation labs. The task continued with named entity recognition and normalization in French narratives, as offered in CLEF eHealth 2015. Named entity recognition involved ten types of entities including disorders that were defined according to Sem...

Publication date: 2017